Hilbert Envelope Based Features for Far-Field Speech Recognition

Authors

  • Samuel Thomas
  • Sriram Ganapathy
  • Hynek Hermansky
Abstract

Automatic speech recognition (ASR) systems, trained on speech signals from close-talking microphones, generally fail in recognizing far-field speech. In this paper, we present a Hilbert Envelope based feature extraction technique to alleviate the artifacts introduced by room reverberations. The proposed technique is based on modeling temporal envelopes of the speech signal in narrow sub-bands using Frequency Domain Linear Prediction (FDLP). ASR experiments on far-field speech using the proposed FDLP features show significant performance improvements when compared to other robust feature extraction techniques (average relative improvement of 43% in word error rate).
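
The abstract names the core idea (fitting a linear-prediction model in the frequency domain so that it traces the temporal envelope of a narrow sub-band) but gives no implementation details. The following is a minimal sketch of that idea, not the authors' implementation. The function names, the rectangular sub-band window, the model order, and the matrix-based DCT are all assumptions made for illustration.

```python
import numpy as np

def dct_ii(x):
    """Orthonormal DCT-II via an explicit matrix (adequate for short segments)."""
    n = len(x)
    k = np.arange(n)[:, None]
    t = np.arange(n)[None, :]
    basis = np.cos(np.pi * (t + 0.5) * k / n)
    scale = np.full(n, np.sqrt(2.0 / n))
    scale[0] = np.sqrt(1.0 / n)
    return scale * (basis @ x)

def fdlp_envelope(x, band, order=20):
    """Approximate the temporal envelope of one sub-band of x.

    band  : (lo, hi) DCT-coefficient indices; a hypothetical rectangular window.
    order : linear-prediction order (an assumed value, not taken from the paper).
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    c = dct_ii(x)[band[0]:band[1]]  # sub-band selected in the DCT domain

    # Autocorrelation of the DCT coefficients for lags 0..order.
    r = np.array([np.dot(c[:len(c) - k], c[k:]) for k in range(order + 1)])
    r[0] *= 1.0 + 1e-9  # tiny regularization for numerical stability

    # Solve the Yule-Walker equations R a = -r[1:] for the predictor.
    lags = np.abs(np.subtract.outer(np.arange(order), np.arange(order)))
    a = np.concatenate(([1.0], np.linalg.solve(r[lags], -r[1:])))
    gain = r[0] + np.dot(a[1:], r[1:])  # prediction-error power

    # The all-pole "spectrum" of this model runs over time, not frequency.
    theta = np.pi * (np.arange(n) + 0.5) / n
    denom = np.exp(-1j * np.outer(theta, np.arange(order + 1))) @ a
    return gain / np.abs(denom) ** 2

# Usage: a noise burst centred near sample 350 of a 512-sample segment.
rng = np.random.default_rng(0)
t = np.arange(512)
burst = rng.standard_normal(512) * np.exp(-0.5 * ((t - 350) / 20.0) ** 2)
envelope = fdlp_envelope(burst, band=(64, 192), order=20)
print(int(np.argmax(envelope)))  # location of the envelope peak
```

Because the prediction is run on frequency-domain coefficients, the resulting all-pole model is a smooth function of time. A practical front end would presumably use longer segments, a bank of overlapping sub-band windows, and a fast DCT.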


Similar resources

Modulation Spectrum Analysis for Recognition of Reverberant Speech

Recognition of reverberant speech constitutes a challenging problem for typical speech recognition systems. This is mainly due to the conventional short-term analysis/compensation techniques. In this paper, we present a feature extraction technique based on modeling long segments of temporal envelopes of the speech signal in narrow sub-bands using frequency domain linear prediction (FDLP). FDLP...


Fepstrum Features: Design and Application to Conversational Speech Recognition

In this paper, we present the Fepstrum features: a principled approach to estimating the modulation spectrum of speech signals using Hilbert envelopes in a nonparametric way. The importance of the modulation spectrum as a feature in automatic speech recognition (ASR) has been established by several researchers over the past two to three decades. However, traditionally, in the speech r...


Hilbert envelope based spectro-temporal features for phoneme recognition in telephone speech

In this paper, we present a spectro-temporal feature extraction technique using sub-band Hilbert envelopes of relatively long segments of speech signal. Hilbert envelopes of the sub-bands are estimated using Frequency Domain Linear Prediction (FDLP). Spectral features are derived by integrating the sub-band Hilbert envelopes in short-term frames and the temporal features are formed by convertin...


Envelope-based inter-aural time difference localization training to improve speech-in-noise perception in the elderly

Background: Many elderly individuals complain of difficulty in understanding speech in noise despite having normal hearing thresholds. According to previous studies, auditory training leads to improvement in speech-in-noise perception, but these studies did not consider the etiology, so their results cannot be generalized. The present study aimed at investigating the effectiveness of envelope-b...


EEMD-Based Speaker Automatic Emotional Recognition in Chinese Mandarin

Emotion feature extraction is the key to speech emotion recognition. Ensemble empirical mode decomposition (EEMD) is a newly developed method aimed at eliminating the emotion mode mixing present in the original empirical mode decomposition (EMD). To evaluate the performance of this new method, this paper investigates the effect of a parameter pertinent to EEMD: the speech emotional envelope. First...



Publication date: 2008